抱歉,您的浏览器无法访问本站

本页面需要浏览器支持(启用)JavaScript


了解详情 >

Apache Doris 简单介绍

YouTube介绍https://www.youtube.com/watch?v=0c2bt8gd6W8

Apache Doris编译

由于Apache Doris没有提供安装包,只有通过官方提供的SourceCode编译才能够进行接下来的安装,所以在本地Mac执行了编译过程,官方推荐使用Docker环境进行编译,官方提供了具有编译的环境的docker镜像,不推荐使用centos自行编译,坑很多,本人踩过后,然以失败告终。so

1
2
3
4
5
6
7
8
9
10
$ docker pull apachedoris/doris-dev:build-env-1.2
$ docker run -it -v /Users/wuxuan.chai/.m2:/root/.m2 -v /Users/wuxuan.chai/Documents/application/doris/:/root/incubator-doris-DORIS-0.13.0-release/ apachedoris/doris-dev:build-env-1.2
$ cd incubator-doris-DORIS-0.13.0-release & wget https://www.apache.org/dyn/closer.cgi?path=/incubator/doris/0.13.0-incubating/apache-doris-0.13.0-incubating-src.tar.gz
$ tar -zxvf apache-doris-0.13.0-incubating-src.tar.gz
# 安装java环境 PE依赖java,配置java环境变量$JAVA_HOME (必须要有jdk,因为后面编译fe的时候是通过maven编译,没有jdk,编译报错)
$ yum install java-1.8.0-openjdk.x86_64

# 执行编译Apache Doris脚本
$ cd incubator-doris-0.13.0-rc03
$ sh build.sh

耐心等待。。。。大概需要一个半小时。。。。

编译过程中maven依赖下载的时候会出错,需要改fe项目的pom文件

具体操作见:https://github.com/apache/incubator-doris/pull/4769/files#diff-ac3be7b9da003e43f78a9a5782e8a2967b476b2a63deb57faab013b2de12810c

这里将编译后的output文件夹上传到各个集群机器

机器准备

机器ip CPU 内存 磁盘 安装服务
10.200.50.71 8 16G 450G BE-01
10.200.50.72 8 16G 100G BE-02
10.200.50.73 8 16G 480G BE-03
10.200.50.74 8 8G 100G FE

fe安装

1、安装jdk >1.8,添加环境变量

2、将output中的fe包copy到适当目录,并做如下初始化操作

1
2
3
4
5
6
7
8
9
$ cd /usr/etc && copy -r ~/output/pe .
# doris-meta 是默认的数据存储位置,可在${DORIS_HOME}/conf/fe.conf 中设置
$ cd /usr/etc/pe/ && mkidr doris-meta
# 设置DORIS_HOME环境变量
$ cat /etc/profile export DORIS_HOME=/usr/etc/pe && source /etc/profile
# 启动doris-fe
$ sh /usr/etc/fe/bin/start_fe.sh --daemon
# 停止doris-fe
$ sh /usr/etc/fe/bin/stop_fe.sh

测试fe是否启动成功:

访问:

http://fe_ip:fe_http_port/api/bootstrap

1
2
3
4
5
6
7
{
"replayedJournalId": 0,
"queryPort": 0,
"rpcPort": 0,
"status": "OK",
"msg": "Success"
}

be安装

设置be的数据存储位置

1
2
3
4
5
6
$ cd /usr/etc && copy -r ~/output/pe .
# storage是数据存储的位置,可在${DORIS_HOME}/conf/be.conf 中设置
$ cd /usr/etc/be && mkdir storage
# 设置DORIS_HOME环境变量
$ cat /etc/profile export DORIS_HOME=/usr/etc/be && source /etc/profile

安装be端的mysql-client

1
2
$ rpm -ivh https://repo.mysql.com//mysql57-community-release-el7-11.noarch.rpm
$ yum install mysql-community-client.x86_64 -y

be端通过mysql-client进入fe端:

1
2
# port 为 pe/config/fe.conf 中的 query_port = 9030 
$ mysql -h 10.200.50.74 -P 9030 -uroot

添加be服务到fe:

1
2
3
4
# 添加be
$ ALTER SYSTEM ADD BACKEND "10.200.50.71:9050";
# 删除be:
$ ALTER SYSTEM dropp BACKEND "10.200.0.51:9050";

启动be:

1
$ cd /usr/etc/be/bin && sh start_be.sh --daemon

查看be是否启动成功

访问 http://fe_ip:fe_http_port/backend

查看当前安装的be的心跳是否被fe监听到

image-20210525101555447

也可以访问be端的监控页面查看:

访问:http://be-ip:webserver_port

image-20210525101856889

到此一个简单的Doris集群部署完成,其中包括一个fe节点和三个be节点,因为我不用从hdfs导入数据,所以没有部署fs_broker,如有需要在做介绍。

Doris使用

因为Doris是一个基于Mysql标准协议的MPP架构数据引擎,所以操作起来和TiDB、Mysql都一样。没有额外的学习成本。

进入fe服务,安装mysql-client(见be安装步骤)

1
2
3
4
5
# 在fe可以这么写
$ mysql -h 127.0.0.1 -P 9030
# 在别的机器,如果当前机器安装了mysql-client之后
$ mysql -h 10.200.50.74 (fe机器Ip) -P 9030

创建数据库

1
2
3
CREATE DATABASE wi;
SHOW DATABASES;
use wi;

建表操作

可以通过mysql-client 执行help create table查看建表的规范

1
help create table;

建测试表

建立一个名字为 table1 的逻辑表。分桶列为 siteid,桶数为 10。

这个表的 schema 如下:

  • siteid:类型是INT(4字节), 默认值为10
  • citycode:类型是SMALLINT(2字节)
  • username:类型是VARCHAR, 最大长度为32, 默认值为空字符串
  • pv:类型是BIGINT(8字节), 默认值是0; 这是一个指标列, Doris内部会对指标列做聚合操作, 这个列的聚合方法是求和(SUM)
1
2
3
4
5
6
7
8
9
10
CREATE TABLE table1
(
siteid INT DEFAULT '10',
citycode SMALLINT,
username VARCHAR(32) DEFAULT '',
pv BIGINT SUM DEFAULT '0'
)
AGGREGATE KEY(siteid, citycode, username)
DISTRIBUTED BY HASH(siteid) BUCKETS 10
PROPERTIES("replication_num" = "1");

复合分区

建立一个名字为 table2 的逻辑表。

这个表的 schema 如下:

  • event_day:类型是DATE,无默认值
  • siteid:类型是INT(4字节), 默认值为10
  • citycode:类型是SMALLINT(2字节)
  • username:类型是VARCHAR, 最大长度为32, 默认值为空字符串
  • pv:类型是BIGINT(8字节), 默认值是0; 这是一个指标列, Doris 内部会对指标列做聚合操作, 这个列的聚合方法是求和(SUM)

我们使用 event_day 列作为分区列,建立3个分区: p201706, p201707, p201708

  • p201706:范围为 [最小值, 2017-07-01)
  • p201707:范围为 [2017-07-01, 2017-08-01)
  • p201708:范围为 [2017-08-01, 2017-09-01)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
CREATE TABLE table2
(
event_day DATE,
siteid INT DEFAULT '10',
citycode SMALLINT,
username VARCHAR(32) DEFAULT '',
pv BIGINT SUM DEFAULT '0'
)
AGGREGATE KEY(event_day, siteid, citycode, username)
PARTITION BY RANGE(event_day)
(
PARTITION p201706 VALUES LESS THAN ('2017-07-01'),
PARTITION p201707 VALUES LESS THAN ('2017-08-01'),
PARTITION p201708 VALUES LESS THAN ('2017-09-01')
)
DISTRIBUTED BY HASH(siteid) BUCKETS 10
PROPERTIES("replication_num" = "1");

注意事项:

  1. 上述表通过设置 replication_num 建的都是单副本的表,Doris建议用户采用默认的 3 副本设置,以保证高可用。
  2. 可以对复合分区表动态的增删分区。详见 HELP ALTER TABLE 中 Partition 相关部分。
  3. 数据导入可以导入指定的 Partition。详见 HELP LOAD
  4. 可以动态修改表的 Schema。
  5. 可以对 Table 增加上卷表(Rollup)以提高查询性能,这部分可以参见高级使用指南关于 Rollup 的描述。
  6. 表的列的Null属性默认为true,会对查询性能有一定的影响。

导入数据

以10.1.1.31机器中的bi_test_crs_0505的asset_record为例

将此表作为Doris的mysql外部表:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
create table asset_reocrd_mysql (
`id` varchar(50) NOT NULL COMMENT '主键UUID',
`card_no` varchar(50) DEFAULT NULL COMMENT '卡号(用于会员),打印小票(用于普通顾客)',
`flow_type` tinyint(4) DEFAULT NULL COMMENT '流水类型: 1-卡资产流水 2: 托管单流水',
`merchant_id` varchar(50) DEFAULT NULL COMMENT '商户ID',
`merchant_name` varchar(200) DEFAULT NULL COMMENT '商户名称',
`shop_id` varchar(50) DEFAULT NULL COMMENT '门店ID',
`shop_code` varchar(50) DEFAULT NULL COMMENT '门店编码',
`shop_name` varchar(200) DEFAULT NULL COMMENT '门店名称',
`member_id` varchar(50) DEFAULT NULL COMMENT '会员ID',
`member_name` varchar(200) DEFAULT NULL COMMENT '会员姓名',
`id_number` varchar(50) DEFAULT NULL COMMENT '会员身份证',
`mobile_number` varchar(50) DEFAULT NULL COMMENT '会员手机号',
`member_level` varchar(50) DEFAULT NULL COMMENT '会员等级',
`member_level_name` varchar(50) DEFAULT NULL COMMENT '会员等级名称',
`member_type` tinyint(4) DEFAULT NULL COMMENT '会员类别 0-会员 1-非会员',
`ref_member_id` varchar(50) DEFAULT NULL COMMENT '引用会员ID',
`biz_code` varchar(50) DEFAULT NULL COMMENT '业务编码',
`amount` decimal(15,2) DEFAULT NULL COMMENT '数额',
`amount2` decimal(15,2) DEFAULT NULL COMMENT '数额2',
`asset_id` varchar(50) DEFAULT NULL COMMENT '资产ID',
`asset_type` tinyint(4) DEFAULT NULL COMMENT '资产类型:0查游戏币,1查彩票,2查银票卡,3查游戏券, 4押金',
`change_flag` tinyint(4) DEFAULT NULL COMMENT '资产变化:0: 增加可用资产, 1: 减少可用资产, 2:增加冻结资产, 3:减少冻结资产',
`order_code` varchar(50) DEFAULT NULL COMMENT '订单编码',
`param1` varchar(200) DEFAULT NULL COMMENT '参数1',
`param2` varchar(200) DEFAULT NULL COMMENT '参数2',
`param3` varchar(200) DEFAULT NULL COMMENT '参数3',
`parent_asset_id` varchar(50) DEFAULT NULL COMMENT '主资产id',
`create_time` datetime NULL COMMENT '创建时间',
`creator_id` varchar(50) DEFAULT NULL COMMENT '创建人ID',
`creator_name` varchar(200) DEFAULT NULL COMMENT '创建人名称',
`create_pos_id` varchar(50) DEFAULT NULL COMMENT '创建机台ID',
`create_pos_name` varchar(200) DEFAULT NULL COMMENT '创建机台名称',
`cashier_type` tinyint(4) DEFAULT NULL COMMENT '收银台类型:0.收银台,1.兑奖台,2.一体机,3.取币机,4.存币机',
`machine_category_id` varchar(50) DEFAULT NULL COMMENT '机台分类ID',
`machine_category_name` varchar(50) DEFAULT NULL COMMENT '机台分类名称',
`machine_id` varchar(50) DEFAULT NULL COMMENT '机台类型ID',
`machine_name` varchar(50) DEFAULT NULL COMMENT '机台类型名称',
`machine_instance_id` varchar(50) DEFAULT NULL COMMENT '机台实例ID',
`machine_instance_name` varchar(50) DEFAULT NULL COMMENT '机台实例名称',
`machine_instance_slot_id` varchar(50) DEFAULT NULL COMMENT '投币口ID',
`device_id` varchar(50) DEFAULT NULL COMMENT '刷卡器ID',
`device_name` varchar(50) DEFAULT NULL COMMENT '刷卡器名称',
`record_status` tinyint(4) DEFAULT NULL COMMENT '流水状态:0:正常,1:异常',
`record_seq` int(11) DEFAULT NULL COMMENT '同一操作产生的多条流水的流水序号',
`remark` varchar(5000) DEFAULT NULL COMMENT '备注',
`machine_instance_slot_no` int(11) DEFAULT NULL COMMENT '机台投币口序号',
`source_type` int(11) DEFAULT '0' COMMENT '数据来源:0-saas,1-迁移',
`ref_asset_id` varchar(50) DEFAULT NULL COMMENT '关联ID',
`ref_asset_type` int(11) DEFAULT NULL COMMENT '关联类型 (1 SVIP赠送)',
`business_model` varchar(50) DEFAULT NULL COMMENT '机台营业模式',
`item_drop_type` varchar(50) DEFAULT NULL COMMENT '物品掉落类型 none:无 coin:游戏币 lottery:飞豆 gift:礼品'
)
ENGINE=mysql
COMMENT '会员流水mysql外部表'
PROPERTIES(
"host" = "10.1.1.31",
"port" = "4001",
"user" = "test_allen",
"password" = "allen@97541",
"database" = "bi_test_crs_0505",
"table" = "asset_record"
)

创建OLAP引擎表:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
CREATE TABLE wi.p_asset_record (
`id` VARCHAR ( 50 ) NOT NULL COMMENT '主键UUID',
`member_id` VARCHAR ( 50 ) DEFAULT NULL COMMENT '会员ID',
`card_no` VARCHAR ( 50 ) DEFAULT NULL COMMENT '卡号(用于会员),打印小票(用于普通顾客)',
`order_code` VARCHAR ( 50 ) DEFAULT NULL COMMENT '订单编码',
`shop_id` VARCHAR ( 50 ) DEFAULT NULL COMMENT '门店ID',
`ref_member_id` VARCHAR ( 50 ) DEFAULT NULL COMMENT '引用会员ID',
`create_time` datetime NOT NULL COMMENT '创建时间',
`flow_type` TINYINT ( 4 ) DEFAULT NULL COMMENT '流水类型: 1-卡资产流水 2: 托管单流水',
`merchant_id` VARCHAR ( 50 ) DEFAULT NULL COMMENT '商户ID',
`merchant_name` VARCHAR ( 200 ) DEFAULT NULL COMMENT '商户名称',
`shop_code` VARCHAR ( 50 ) DEFAULT NULL COMMENT '门店编码',
`shop_name` VARCHAR ( 200 ) DEFAULT NULL COMMENT '门店名称',
`member_name` VARCHAR ( 200 ) DEFAULT NULL COMMENT '会员姓名',
`id_number` VARCHAR ( 100 ) DEFAULT NULL COMMENT '会员身份证',
`mobile_number` VARCHAR ( 50 ) DEFAULT NULL COMMENT '会员手机号',
`member_level` VARCHAR ( 50 ) DEFAULT NULL COMMENT '会员等级',
`member_level_name` VARCHAR ( 50 ) DEFAULT NULL COMMENT '会员等级名称',
`member_type` TINYINT ( 4 ) DEFAULT NULL COMMENT '会员类别 0-会员 1-非会员',
`biz_code` VARCHAR ( 50 ) DEFAULT NULL COMMENT '业务编码',
`amount` DECIMAL ( 15, 2 ) DEFAULT NULL COMMENT '数额',
`amount2` DECIMAL ( 15, 2 ) DEFAULT NULL COMMENT '数额2',
`asset_id` VARCHAR ( 50 ) DEFAULT NULL COMMENT '资产ID',
`asset_type` TINYINT ( 4 ) DEFAULT NULL COMMENT '资产类型:0查游戏币,1查彩票,2查银票卡,3查游戏券, 4押金',
`change_flag` TINYINT ( 4 ) DEFAULT NULL COMMENT '资产变化:0: 增加可用资产, 1: 减少可用资产, 2:增加冻结资产, 3:减少冻结资产',
`param1` VARCHAR ( 200 ) DEFAULT NULL COMMENT '参数1',
`param2` VARCHAR ( 200 ) DEFAULT NULL COMMENT '参数2',
`param3` VARCHAR ( 200 ) DEFAULT NULL COMMENT '参数3',
`parent_asset_id` VARCHAR ( 50 ) DEFAULT NULL COMMENT '主资产id',
`creator_id` VARCHAR ( 50 ) DEFAULT NULL COMMENT '创建人ID',
`creator_name` VARCHAR ( 200 ) DEFAULT NULL COMMENT '创建人名称',
`create_pos_id` VARCHAR ( 50 ) DEFAULT NULL COMMENT '创建机台ID',
`create_pos_name` VARCHAR ( 200 ) DEFAULT NULL COMMENT '创建机台名称',
`cashier_type` TINYINT ( 4 ) DEFAULT NULL COMMENT '收银台类型:0.收银台,1.兑奖台,2.一体机,3.取币机,4.存币机',
`machine_id` VARCHAR ( 50 ) DEFAULT NULL COMMENT '机台类型ID',
`machine_category_id` VARCHAR ( 50 ) DEFAULT NULL COMMENT '机台分类ID',
`machine_category_name` VARCHAR ( 50 ) DEFAULT NULL COMMENT '机台分类名称',
`machine_name` VARCHAR ( 50 ) DEFAULT NULL COMMENT '机台类型名称',
`machine_instance_id` VARCHAR ( 50 ) DEFAULT NULL COMMENT '机台实例ID',
`machine_instance_name` VARCHAR ( 50 ) DEFAULT NULL COMMENT '机台实例名称',
`machine_instance_slot_id` VARCHAR ( 50 ) DEFAULT NULL COMMENT '投币口ID',
`device_id` VARCHAR ( 50 ) DEFAULT NULL COMMENT '刷卡器ID',
`device_name` VARCHAR ( 50 ) DEFAULT NULL COMMENT '刷卡器名称',
`record_status` TINYINT ( 4 ) DEFAULT NULL COMMENT '流水状态:0:正常,1:异常',
`record_seq` INT ( 11 ) DEFAULT NULL COMMENT '同一操作产生的多条流水的流水序号',
`remark` VARCHAR ( 5000 ) DEFAULT NULL COMMENT '备注',
`machine_instance_slot_no` INT ( 11 ) DEFAULT NULL COMMENT '机台投币口序号',
`source_type` INT ( 11 ) DEFAULT '0' COMMENT '数据来源:0-saas,1-迁移',
`ref_asset_id` VARCHAR ( 50 ) DEFAULT NULL COMMENT '关联ID',
`ref_asset_type` INT ( 11 ) DEFAULT NULL COMMENT '关联类型 (1 SVIP赠送)',
`business_model` VARCHAR ( 50 ) DEFAULT NULL COMMENT '机台营业模式',
`item_drop_type` VARCHAR ( 50 ) DEFAULT NULL COMMENT '物品掉落类型 none:无 coin:游戏币 lottery:飞豆 gift:礼品' ,
INDEX `idx_asset_record_2`(`member_id`) using BITMAP COMMENT 'member_id',
INDEX `idx_asset_record_3` (`card_no`) using BITMAP COMMENT 'member_id',
INDEX `idx_asset_record_4` (`order_code`) using BITMAP COMMENT 'member_id',
INDEX `idx_asset_record_6` (`ref_member_id`) using BITMAP COMMENT 'member_id',
INDEX `idx_asset_record_7` (`create_time`) using BITMAP COMMENT 'member_id',
)
ENGINE = olap
UNIQUE KEY ( `id`, `member_id`, `card_no`, `order_code`, `shop_id`, `ref_member_id`, `create_time` )
rollup ((`shop_id`,`create_time`))
PARTITION BY RANGE ( `create_time` ) (
PARTITION `p201701`
VALUES
LESS THAN ( "2017-02-01 00:00:00" ),
PARTITION `p201702`
VALUES
LESS THAN ( "2017-03-01 00:00:00" ),
PARTITION `p201703`
VALUES
LESS THAN ( "2017-04-01 00:00:00" ),
PARTITION `p201704`
VALUES
LESS THAN ( "2017-05-01 00:00:00" ),
PARTITION `p201705`
VALUES
LESS THAN ( "2017-06-01 00:00:00" ),
PARTITION `p201706`
VALUES
LESS THAN ( "2017-07-01 00:00:00" ),
PARTITION `p201707`
VALUES
LESS THAN ( "2017-08-01 00:00:00" ),
PARTITION `p201708`
VALUES
LESS THAN ( "2017-09-01 00:00:00" ),
PARTITION `p201709`
VALUES
LESS THAN ( "2017-10-01 00:00:00" ),
PARTITION `p201710`
VALUES
LESS THAN ( "2017-11-01 00:00:00" ),
PARTITION `p201711`
VALUES
LESS THAN ( "2017-12-01 00:00:00" ),
PARTITION `p201712`
VALUES
LESS THAN ( "2018-01-01 00:00:00" ),
PARTITION `p201801`
VALUES
LESS THAN ( "2018-02-01 00:00:00" ),
PARTITION `p201802`
VALUES
LESS THAN ( "2018-03-01 00:00:00" ),
PARTITION `p201803`
VALUES
LESS THAN ( "2018-04-01 00:00:00" ),
PARTITION `p201804`
VALUES
LESS THAN ( "2018-05-01 00:00:00" ),
PARTITION `p201805`
VALUES
LESS THAN ( "2018-06-01 00:00:00" ),
PARTITION `p201806`
VALUES
LESS THAN ( "2018-07-01 00:00:00" ),
PARTITION `p201807`
VALUES
LESS THAN ( "2018-08-01 00:00:00" ),
PARTITION `p201808`
VALUES
LESS THAN ( "2018-09-01 00:00:00" ),
PARTITION `p201809`
VALUES
LESS THAN ( "2018-10-01 00:00:00" ),
PARTITION `p201810`
VALUES
LESS THAN ( "2018-11-01 00:00:00" ),
PARTITION `p201811`
VALUES
LESS THAN ( "2018-12-01 00:00:00" ),
PARTITION `p201812`
VALUES
LESS THAN ( "2019-01-01 00:00:00" ),
PARTITION `p201901`
VALUES
LESS THAN ( "2019-02-01 00:00:00" ),
PARTITION `p201902`
VALUES
LESS THAN ( "2019-03-01 00:00:00" ),
PARTITION `p201903`
VALUES
LESS THAN ( "2019-04-01 00:00:00" ),
PARTITION `p201904`
VALUES
LESS THAN ( "2019-05-01 00:00:00" ),
PARTITION `p201905`
VALUES
LESS THAN ( "2019-06-01 00:00:00" ),
PARTITION `p201906`
VALUES
LESS THAN ( "2019-07-01 00:00:00" ),
PARTITION `p201907`
VALUES
LESS THAN ( "2019-08-01 00:00:00" ),
PARTITION `p201908`
VALUES
LESS THAN ( "2019-09-01 00:00:00" ),
PARTITION `p201909`
VALUES
LESS THAN ( "2019-10-01 00:00:00" ),
PARTITION `p201910`
VALUES
LESS THAN ( "2019-11-01 00:00:00" ),
PARTITION `p201911`
VALUES
LESS THAN ( "2019-12-01 00:00:00" ),
PARTITION `p201912`
VALUES
LESS THAN ( "2020-01-01 00:00:00" ),
PARTITION `p202001`
VALUES
LESS THAN ( "2020-02-01 00:00:00" ),
PARTITION `p202002`
VALUES
LESS THAN ( "2020-03-01 00:00:00" ),
PARTITION `p202003`
VALUES
LESS THAN ( "2020-04-01 00:00:00" ),
PARTITION `p202004`
VALUES
LESS THAN ( "2020-05-01 00:00:00" ),
PARTITION `p202005`
VALUES
LESS THAN ( "2020-06-01 00:00:00" ),
PARTITION `p202006`
VALUES
LESS THAN ( "2020-07-01 00:00:00" ),
PARTITION `p202007`
VALUES
LESS THAN ( "2020-08-01 00:00:00" ),
PARTITION `p202008`
VALUES
LESS THAN ( "2020-09-01 00:00:00" ),
PARTITION `p202009`
VALUES
LESS THAN ( "2020-10-01 00:00:00" ),
PARTITION `p202010`
VALUES
LESS THAN ( "2020-11-01 00:00:00" ),
PARTITION `p202011`
VALUES
LESS THAN ( "2020-12-01 00:00:00" ),
PARTITION `p202012`
VALUES
LESS THAN ( "2021-01-01 00:00:00" ),
PARTITION `p202101`
VALUES
LESS THAN ( "2021-02-01 00:00:00" ),
PARTITION `p202102`
VALUES
LESS THAN ( "2021-03-01 00:00:00" ),
PARTITION `p202103`
VALUES
LESS THAN ( "2021-04-01 00:00:00" ),
PARTITION `p202104`
VALUES
LESS THAN ( "2021-05-01 00:00:00" ),
PARTITION `p202105`
VALUES
LESS THAN ( "2021-06-01 00:00:00" ),
PARTITION `p202106`
VALUES
LESS THAN ( "2021-07-01 00:00:00" ),
PARTITION `p202107`
VALUES
LESS THAN ( "2021-08-01 00:00:00" ),
PARTITION `p202108`
VALUES
LESS THAN ( "2021-09-01 00:00:00" ),
PARTITION `p202109`
VALUES
LESS THAN ( "2021-10-01 00:00:00" ),
PARTITION `p202110`
VALUES
LESS THAN ( "2021-11-01 00:00:00" ),
PARTITION `p202111`
VALUES
LESS THAN ( "2021-12-01 00:00:00" ),
PARTITION `p202112`
VALUES
LESS THAN ( "2022-01-01 00:00:00" ),
PARTITION `p202201`
VALUES
LESS THAN ( "2022-02-01 00:00:00" ),
PARTITION `p202202`
VALUES
LESS THAN ( "2022-03-01 00:00:00" ),
PARTITION `p202203`
VALUES
LESS THAN ( "2022-04-01 00:00:00" ),
PARTITION `p202204`
VALUES
LESS THAN ( "2022-05-01 00:00:00" ),
PARTITION `p202205`
VALUES
LESS THAN ( "2022-06-01 00:00:00" ),
PARTITION `p202206`
VALUES
LESS THAN ( "2022-07-01 00:00:00" ),
PARTITION `p202207`
VALUES
LESS THAN ( "2022-08-01 00:00:00" ),
PARTITION `p202208`
VALUES
LESS THAN ( "2022-09-01 00:00:00" ),
PARTITION `p202209`
VALUES
LESS THAN ( "2022-10-01 00:00:00" ),
PARTITION `p202210`
VALUES
LESS THAN ( "2022-11-01 00:00:00" ),
PARTITION `p202211`
VALUES
LESS THAN ( "2022-12-01 00:00:00" ),
PARTITION `p202212`
VALUES
LESS THAN ( "2023-01-01 00:00:00" ),
PARTITION `p202301`
VALUES
LESS THAN ( "2023-02-01 00:00:00" ),
PARTITION `p202302`
VALUES
LESS THAN ( "2023-03-01 00:00:00" ),
PARTITION `p202303`
VALUES
LESS THAN ( "2023-04-01 00:00:00" ),
PARTITION `p202304`
VALUES
LESS THAN ( "2023-05-01 00:00:00" ),
PARTITION `p202305`
VALUES
LESS THAN ( "2023-06-01 00:00:00" ),
PARTITION `p202306`
VALUES
LESS THAN ( "2023-07-01 00:00:00" ),
PARTITION `p202307`
VALUES
LESS THAN ( "2023-08-01 00:00:00" ),
PARTITION `p202308`
VALUES
LESS THAN ( "2023-09-01 00:00:00" ),
PARTITION `p202309`
VALUES
LESS THAN ( "2023-10-01 00:00:00" ),
PARTITION `p202310`
VALUES
LESS THAN ( "2023-11-01 00:00:00" ),
PARTITION `p202311`
VALUES
LESS THAN ( "2023-12-01 00:00:00" ),
PARTITION `p202312`
VALUES
LESS THAN ( "2024-01-01 00:00:00" ),
PARTITION `p202401`
VALUES
LESS THAN ( "2024-02-01 00:00:00" ),
PARTITION `p202402`
VALUES
LESS THAN ( "2024-03-01 00:00:00" ),
PARTITION `p202403`
VALUES
LESS THAN ( "2024-04-01 00:00:00" ),
PARTITION `p202404`
VALUES
LESS THAN ( "2024-05-01 00:00:00" ),
PARTITION `p202405`
VALUES
LESS THAN ( "2024-06-01 00:00:00" ),
PARTITION `p202406`
VALUES
LESS THAN ( "2024-07-01 00:00:00" ),
PARTITION `p202407`
VALUES
LESS THAN ( "2024-08-01 00:00:00" ),
PARTITION `p202408`
VALUES
LESS THAN ( "2024-09-01 00:00:00" ),
PARTITION `p202409`
VALUES
LESS THAN ( "2024-10-01 00:00:00" ),
PARTITION `p202410`
VALUES
LESS THAN ( "2024-11-01 00:00:00" ),
PARTITION `p202411`
VALUES
LESS THAN ( "2024-12-01 00:00:00" ),
PARTITION `p202412`
VALUES
LESS THAN ( "2025-01-01 00:00:00" ),
PARTITION `p202501`
VALUES
LESS THAN ( "2025-02-01 00:00:00" ),
PARTITION `p202502`
VALUES
LESS THAN ( "2025-03-01 00:00:00" ),
PARTITION `p202503`
VALUES
LESS THAN ( "2025-04-01 00:00:00" ),
PARTITION `p202504`
VALUES
LESS THAN ( "2025-05-01 00:00:00" ),
PARTITION `p202505`
VALUES
LESS THAN ( "2025-06-01 00:00:00" ),
PARTITION `p202506`
VALUES
LESS THAN ( "2025-07-01 00:00:00" ),
PARTITION `p202507`
VALUES
LESS THAN ( "2025-08-01 00:00:00" ),
PARTITION `p202508`
VALUES
LESS THAN ( "2025-09-01 00:00:00" ),
PARTITION `p202509`
VALUES
LESS THAN ( "2025-10-01 00:00:00" ),
PARTITION `p202510`
VALUES
LESS THAN ( "2025-11-01 00:00:00" ),
PARTITION `p202511`
VALUES
LESS THAN ( "2025-12-01 00:00:00" ),
PARTITION `p202512`
VALUES
LESS THAN ( "2026-01-01 00:00:00" )
)
DISTRIBUTED BY HASH ( `id`, `member_id`, `card_no`, `order_code`, `shop_id`, `ref_member_id` ) BUCKETS 32
PROPERTIES ( "replication_num" = "1", "storage_type" = "column" );

注意:

Doris的建表非常的严格:

1、主键必须写在所有字段的前面

2、分区键必须是主键

3、分区键不支持表达式且没有MAXVALUE、MINVALUE这种方式的边界分区(后面测试会测试下,如果插入分区值大于分区条件的场景,确认下是否报错)

4、分区键的值format很严格,date和datetime不是一类,必须写全

insert query

1
2
3
insert into asset_record with LABEL label1  select * from asset_record_mysql limit 10;

-- 必须指定label

CSV 导入数据

参考文档

Stream load命令:

1
curl --location-trusted -u root -T asset_record_2021-05-01.csv  -XPUT http://10.200.50.74:8030/api/wi/asset_record/_stream_load

为了导入便利,借助ck的mysql引擎,做了愚下操作,不用Doris的mysql外部表的原因,性能不稳定。所以写了如下的shell:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
#!/bin/bash
begin_date="20170306"
#end_date="20210507"
end_date="20210507"
while [ "$begin_date" -le "$end_date" ];
do
year=${begin_date:0:4}
month=$(date -d "$begin_date" +%m)
day=$(date -d "$begin_date" +%d)
current_date=$year-$month-$day
begin_date=$(date -d "${begin_date}+1days" +%Y%m%d)
n_year=${begin_date:0:4}
n_month=$(date -d "$begin_date" +%m)
n_day=$(date -d "$begin_date" +%d)
next_date=$n_year-$n_month-$n_day
echo "开始导出:$current_date 数据>>>"

# 打印导出命令,供参考
echo "开始执行:clickhouse-client -d mysql_database_31 --query \"select id,member_id,card_no,order_code,shop_id,ref_member_id,create_time,flow_type,merchant_id,merchant_name,shop_code,shop_name,member_name,id_number,mobile_number,member_level,member_level_name,member_type,biz_code,amount,amount2,asset_id,asset_type,change_flag,param1,param2,param3,parent_asset_id,creator_id,creator_name,create_pos_id,create_pos_name,cashier_type,machine_id,machine_category_id,machine_category_name,machine_name,machine_instance_id,machine_instance_name,machine_instance_slot_id,device_id,device_name,record_status,record_seq,remark,machine_instance_slot_no,source_type,ref_asset_id,ref_asset_type,business_model,item_drop_type from asset_record where create_time>= '$current_date' and create_time<'$next_date'\" > /home/shoppingmall/asset_record/asset_record_$current_date.csv"
# clickhouse 导出数据
clickhouse-client -d mysql_database_31 --query "select id,member_id,card_no,order_code,shop_id,ref_member_id,create_time,flow_type,merchant_id,merchant_name,shop_code,shop_name,member_name,id_number,mobile_number,member_level,member_level_name,member_type,biz_code,amount,amount2,asset_id,asset_type,change_flag,param1,param2,param3,parent_asset_id,creator_id,creator_name,create_pos_id,create_pos_name,cashier_type,machine_id,machine_category_id,machine_category_name,machine_name,machine_instance_id,machine_instance_name,machine_instance_slot_id,device_id,device_name,record_status,record_seq,remark,machine_instance_slot_no,source_type,ref_asset_id,ref_asset_type,business_model,item_drop_type from asset_record where create_time >= '$current_date' and create_time<'$next_date'" > /home/shoppingmall/asset_record/asset_record_$current_date.csv

# Strem Load 导入数据到doris
curl --location-trusted -u root: -T /home/shoppingmall/asset_record/asset_record_$current_date.csv -XPUT http://10.200.50.74:8030/api/wi/asset_record/_stream_load

# 删除csv文件
rm -rf /home/shoppingmall/asset_record/asset_record_$current_date.csv

echo "结束:$current_date 数据导出<<<"
done

查询场景测试

所有测试场景的过滤条件不变,日期粒度:天、月、三个月、六个月,租户id=dwj,shop_id动态传入

点查场景

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
-- dwj一家店铺一天的流水分页(20)
select * from asset_record where merchant_id = 'dwj' and shop_id = ? and create_time >= '2021-02-06 00:00:00' and create_time <='2021-02-06 23:59:59' order by create_time desc limit 0,20;
-- 单查:0.66s-3.2s


-- dwj一家店铺一月的流水分页(20)
select * from asset_record where merchant_id = 'dwj' and shop_id = ? and create_time >= '2021-02-01 00:00:00' and create_time <='2021-02-28 23:59:59' order by create_time desc limit 0,20;
-- 单查:1.18s-14s

-- dwj一家店铺三个月的流水分页(20)
select * from asset_record where merchant_id = 'dwj' and shop_id = ? and create_time >= '2021-01-01 00:00:00' and create_time <='2021-03-31 23:59:59' order by create_time desc limit 0,20;
-- 单查:69s-30s

-- dwj一家店铺六个月的流水分页(20)
select * from asset_record where merchant_id = 'dwj' and shop_id = ? and create_time >= '2020-10-01 00:00:00' and create_time <=
'2021-03-31 23:59:59' order by create_time desc limit 0,20;
-- 单查:1min43s - 24.19 sec

聚合场景

一个度量

一个维度
1
2
3
4
5
6
7
8
9
10
11
-- dwj一家店铺一天的流水分页(20)
select shop_id,sum(amount) from asset_record where merchant_id = 'dwj' and shop_id = ? and create_time >= '2021-02-06 00:00:00' and create_time <='2021-02-06 23:59:59' group by shop_id,create_time order by create_time desc limit 0,20;

-- dwj一家店铺一月的流水分页(20)
select shop_id,sum(amount) from asset_record where merchant_id = 'dwj' and shop_id = ? and create_time >= '2021-02-01 00:00:00' and create_time <='2021-02-28 23:59:59' group by shop_id,create_time order by create_time desc limit 0,20;

-- dwj一家店铺三个月的流水分页(20)
select shop_id,sum(amount) from asset_record where merchant_id = 'dwj' and shop_id = ? and create_time >= '2021-01-01 00:00:00' and create_time <='2021-03-31 23:59:59' group by shop_id,create_time order by create_time desc limit 0,20;

-- dwj一家店铺六个月的流水分页(20)
select shop_id,sum(amount) from asset_record where merchant_id = 'dwj' and shop_id = ? and create_time >= '2020-10-01 00:00:00' and create_time <='2021-03-31 23:59:59' group by shop_id,create_time order by create_time desc limit 0,20;
二个维度
1
2
3
4
5
6
7
8
9
10
11
-- dwj一家店铺一天的流水分页(20)
select shop_id,sum(amount) from asset_record where merchant_id = 'dwj' and shop_id = ? and create_time >= '2021-02-06 00:00:00' and create_time <='2021-02-06 23:59:59' group by shop_id order by create_time desc limit 0,20;

-- dwj一家店铺一月的流水分页(20)
select shop_id,sum(amount) from asset_record where merchant_id = 'dwj' and shop_id = ? and create_time >= '2021-02-01 00:00:00' and create_time <='2021-02-28 23:59:59' group by shop_id order by order by create_time desc limit 0,20;

-- dwj一家店铺三个月的流水分页(20)
select shop_id,sum(amount) from asset_record where merchant_id = 'dwj' and shop_id = ? and create_time >= '2021-01-01 00:00:00' and create_time <='2021-03-31 23:59:59' group by shop_id order by create_time desc limit 0,20;

-- dwj一家店铺六个月的流水分页(20)
select shop_id,sum(amount) from asset_record where merchant_id = 'dwj' and shop_id = ? and create_time >= '2020-10-01 00:00:00' and create_time <='2021-03-31 23:59:59' group by shop_id order by create_time desc limit 0,20;
三个维度
四个维度

两个度量

关联查询场景

运维相关

升级fe

停掉fe服务

1
$ sh ${DORIS_HOME}/bin/stop_fe.sh

copy 之前fe目录中的doris-meta、conf(若配置文件有变更,除外)文件,迁移至心的fe包中,适当情况下对老的fe包进行备份

启动fe服务即可,如果保留之前的日志文件,copy旧的log目录到新的fe目录下

升级be

停掉be服务

1
$ sh ${DORIS_HOME}/bin/stop_be.sh

copy 之前be目录中的storage、conf(若配置文件有变更,除外)文件,迁移至心的be包中,适当情况下对老的be包进行备份

启动be服务即可,如果保留之前的日志文件,copy旧的log目录到新的be目录下

问题集合:

1、建表语句中日期字段无法指定当前时间作为默认值

1
`created` DATETIME REPLACE DEFAULT CURRENT_TIMESTAMP comment "创建时间"

推荐在insert的时候now()函数代替

其他的类型的默认值,有单引号 ''包起来,不然会报错

2、创建mysql外部表出现

1
ERROR 1064 (HY000): errCode = 2, detailMessage = Don't support MySQL table, you should rebuild Doris with WITH_MYSQL option ON

说明在编译Doris时没有指定WITH_MYSQL=1,

重新编译Doris

1
$ WITH_MYSQL=1 sh build.sh

重新更新be服务

3、合理设置storage_page_cache_limit,默认20G

防止查询将内存打满,导致服务器kill掉be的进程,我目前为了稳妥设置的是,服务器内存的一半。

4、索引

只支持bitmap索引,一个索引字段只能为一个字段,不能创建联合索引

rollup必须包含所有的联合主键字段

评论